Overview

Dataset statistics

Number of variables: 18
Number of observations: 1465783
Missing cells: 1896895
Missing cells (%): 7.2%
Duplicate rows: 15832
Duplicate rows (%): 1.1%
Total size in memory: 1.4 GiB
Average record size in memory: 994.1 B
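
The two percentages in the dataset statistics can be recomputed from the raw counts. The following is a minimal sketch, assuming the missing-cell share is taken over every cell (variables times observations) and the duplicate share over all rows; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 18
n_observations = 1465783
missing_cells = 1896895
duplicate_rows = 15832

# Assumption: the missing-cell share is taken over every cell (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate share is taken over all rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

Under these assumptions both values round to the percentages listed in the report.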

Variable types

Categorical: 7
Text: 8
Numeric: 3

Alerts

Dataset has 15832 (1.1%) duplicate rows [Duplicates]
PresentState is highly imbalanced (95.2%) [Imbalance]
PermanentAddress is highly imbalanced (> 99.9%) [Imbalance]
PermanentState is highly imbalanced (95.2%) [Imbalance]
InjuryType has 456136 (31.1%) missing values [Missing]
Injury_Nature has 1438422 (98.1%) missing values [Missing]
age has 70740 (4.8%) zeros [Zeros]
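
The Imbalance alerts report a single percentage per variable, and the report does not state how it is derived. One common definition, assumed here, is one minus the Shannon entropy of the category shares divided by the maximum entropy for that number of categories. The function name `imbalance_score` and the input counts below are hypothetical and only illustrate the idea.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy of the class shares.

    Assumed definition: 0.0 means perfectly balanced classes, values near 1.0
    mean that a single class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Illustrative inputs (not taken from the report).
balanced = [250, 250, 250, 250]
skewed = [9970, 10, 10, 10]

print(f"balanced: {imbalance_score(balanced):.3f}")
print(f"skewed:   {imbalance_score(skewed):.3f}")
```

A score computed this way differs from the share of the most frequent category, which would explain why an alert percentage and a Common Values percentage for the same variable need not match.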

Reproduction

Analysis started: 2024-04-14 05:57:28.324118
Analysis finished: 2024-04-14 06:00:25.380735
Duration: 2 minutes and 57.06 seconds
Software version: ydata-profiling v4.7.0
Download configuration: config.json
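
The reported duration follows from the two timestamps. A small sketch using only the Python standard library, assuming both timestamps share the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-04-14 05:57:28.324118", FMT)
finished = datetime.strptime("2024-04-14 06:00:25.380735", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```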

Variables

District_Name
Categorical

Distinct: 41
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 94.7 MiB

Length

Max length: 23
Median length: 18
Mean length: 10.744256
Min length: 3

Characters and Unicode

Total characters: 15748748
Distinct characters: 40
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Bagalkot
2nd row: Bagalkot
3rd row: Bagalkot
4th row: Bagalkot
5th row: Bagalkot

Common Values

Value              Count   Frequency (%)
Bengaluru City     293958  20.1%
Tumakuru           66879   4.6%
Hassan             64399   4.4%
Belagavi Dist      60516   4.1%
Bengaluru Dist     60000   4.1%
Shivamogga         58342   4.0%
Mandya             54474   3.7%
Chitradurga        46664   3.2%
Mysuru Dist        45672   3.1%
Davanagere         41943   2.9%
Other values (31)  672936  45.9%

Length

Figure: Histogram of lengths of the category

Words

Value              Count   Frequency (%)
city               406583  19.2%
bengaluru          353959  16.7%
dist               166188  7.8%
belagavi           80061   3.8%
mysuru             76206   3.6%
tumakuru           66879   3.2%
hassan             64399   3.0%
shivamogga         58342   2.8%
kannada            54478   2.6%
mandya             54474   2.6%
Other values (33)  738218  34.8%

Most occurring characters

Value              Count    Frequency (%)
a                  2745052  17.4%
u                  1428766  9.1%
r                  1138016  7.2%
i                  1131101  7.2%
g                  907428   5.8%
l                  764968   4.9%
n                  761944   4.8%
t                  718336   4.6%
(space)            654004   4.2%
y                  597325   3.8%
Other values (30)  4901808  31.1%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  15748748  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

Distinct: 1044
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 100.1 MiB

Length

Max length: 44
Median length: 30
Mean length: 14.637915
Min length: 3

Characters and Unicode

Total characters: 21456007
Distinct characters: 54
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: Amengad PS
2nd row: Amengad PS
3rd row: Amengad PS
4th row: Amengad PS
5th row: Amengad PS

Words

Value               Count    Frequency (%)
ps                  1459100  40.0%
rural               157282   4.3%
traffic             131192   3.6%
town                81321    2.2%
crime               51770    1.4%
cen                 50795    1.4%
nagar               39496    1.1%
women               24072    0.7%
south               17104    0.5%
layout              15292    0.4%
Other values (823)  1618665  44.4%

Most occurring characters

Value              Count    Frequency (%)
a                  3262872  15.2%
(space)            2211315  10.3%
S                  1637267  7.6%
P                  1521395  7.1%
r                  1390312  6.5%
i                  1008860  4.7%
l                  910718   4.2%
n                  887906   4.1%
u                  865220   4.0%
e                  682382   3.2%
Other values (44)  7077760  33.0%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  21456007  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

Year
Real number (ℝ)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.913
Minimum: 2016
Maximum: 2024
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 MiB

Quantile statistics

Minimum: 2016
5-th percentile: 2016
Q1: 2018
median: 2020
Q3: 2022
95-th percentile: 2023
Maximum: 2024
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.3788086
Coefficient of variation (CV): 0.0011776787
Kurtosis: -1.2089886
Mean: 2019.913
Median Absolute Deviation (MAD): 2
Skewness: -0.10328035
Sum: 2.9607542 × 10^9
Variance: 5.6587306
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=9)

Common Values

Value  Count   Frequency (%)
2023   227315  15.5%
2022   202292  13.8%
2021   177218  12.1%
2019   174211  11.9%
2018   170560  11.6%
2020   166943  11.4%
2017   162718  11.1%
2016   140729  9.6%
2024   43797   3.0%

Minimum 10 values

Value  Count   Frequency (%)
2016   140729  9.6%
2017   162718  11.1%
2018   170560  11.6%
2019   174211  11.9%
2020   166943  11.4%
2021   177218  12.1%
2022   202292  13.8%
2023   227315  15.5%
2024   43797   3.0%

Maximum 10 values

Value  Count   Frequency (%)
2024   43797   3.0%
2023   227315  15.5%
2022   202292  13.8%
2021   177218  12.1%
2020   166943  11.4%
2019   174211  11.9%
2018   170560  11.6%
2017   162718  11.1%
2016   140729  9.6%
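
Because Year has only nine distinct values, its descriptive statistics can be recomputed from the frequency table alone. The sketch below assumes the report uses the sample variance (dividing by n - 1); the variable names are illustrative.

```python
import math

# Year -> count, copied from the frequency table above.
year_counts = {
    2016: 140729, 2017: 162718, 2018: 170560, 2019: 174211, 2020: 166943,
    2021: 177218, 2022: 202292, 2023: 227315, 2024: 43797,
}

n = sum(year_counts.values())
total = sum(year * count for year, count in year_counts.items())
mean = total / n

# Assumption: sample variance, i.e. squared deviations divided by n - 1.
squared_dev = sum(count * (year - mean) ** 2 for year, count in year_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation is the standard deviation relative to the mean.
cv = std / mean

print(n, round(mean, 3), round(variance, 4), round(std, 4), round(cv, 7))
```

Under that assumption the recomputed mean, variance, standard deviation, and coefficient of variation agree with the reported values to the precision shown.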

Month
Real number (ℝ)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.3513944
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.5109914
Coefficient of variation (CV): 0.55279065
Kurtosis: -1.2476962
Mean: 6.3513944
Median Absolute Deviation (MAD): 3
Skewness: 0.055712088
Sum: 9309766
Variance: 12.327061
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=12)

Common Values

Value             Count   Frequency (%)
1                 137944  9.4%
2                 134529  9.2%
3                 131974  9.0%
12                124326  8.5%
5                 122591  8.4%
6                 120906  8.2%
10                118128  8.1%
11                116501  7.9%
8                 115438  7.9%
7                 115325  7.9%
Other values (2)  228121  15.6%

Minimum 10 values

Value  Count   Frequency (%)
1      137944  9.4%
2      134529  9.2%
3      131974  9.0%
4      114024  7.8%
5      122591  8.4%
6      120906  8.2%
7      115325  7.9%
8      115438  7.9%
9      114097  7.8%
10     118128  8.1%

Maximum 10 values

Value  Count   Frequency (%)
12     124326  8.5%
11     116501  7.9%
10     118128  8.1%
9      114097  7.8%
8      115438  7.9%
7      115325  7.9%
6      120906  8.2%
5      122591  8.4%
4      114024  7.8%
3      131974  9.0%
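
Each Month table above is capped at ten rows, so no single table lists all twelve months. Merging the table of the ten lowest values with the table of the ten highest values recovers the full distribution, which can then be checked against the reported Sum. A minimal sketch with illustrative variable names:

```python
# Ten lowest and ten highest month values, copied from the last two tables above.
lowest_ten = {1: 137944, 2: 134529, 3: 131974, 4: 114024, 5: 122591,
              6: 120906, 7: 115325, 8: 115438, 9: 114097, 10: 118128}
highest_ten = {12: 124326, 11: 116501, 10: 118128, 9: 114097, 8: 115438,
               7: 115325, 6: 120906, 5: 122591, 4: 114024, 3: 131974}

# Months that appear in both tables must carry the same count.
shared = lowest_ten.keys() & highest_ten.keys()
consistent = all(lowest_ten[m] == highest_ten[m] for m in shared)

# Merging the two views yields one count per calendar month.
month_counts = {**lowest_ten, **highest_ten}

n = sum(month_counts.values())
weighted_sum = sum(month * count for month, count in month_counts.items())

# The two months hidden behind "Other values (2)" in the first table.
other_values = month_counts[4] + month_counts[9]

print(len(month_counts), n, weighted_sum, other_values)
```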

age
Real number (ℝ)

ZEROS 

Distinct: 113
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.333776
Minimum: -19
Maximum: 665
Zeros: 70740
Zeros (%): 4.8%
Negative: 14
Negative (%): < 0.1%
Memory size: 11.2 MiB

Quantile statistics

Minimum: -19
5-th percentile: 1
Q1: 24
median: 33
Q3: 45
95-th percentile: 64
Maximum: 665
Range: 684
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 16.607518
Coefficient of variation (CV): 0.48370789
Kurtosis: 1.6627866
Mean: 34.333776
Median Absolute Deviation (MAD): 11
Skewness: 0.22427884
Sum: 50325865
Variance: 275.80966
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value               Count   Frequency (%)
0                   70740   4.8%
35                  64017   4.4%
30                  61242   4.2%
45                  56173   3.8%
40                  54790   3.7%
25                  48681   3.3%
28                  48162   3.3%
32                  43527   3.0%
38                  38821   2.6%
50                  38590   2.6%
Other values (103)  941040  64.2%

Minimum 10 values

Value  Count  Frequency (%)
-19    1      < 0.1%
-18    13     < 0.1%
0      70740  4.8%
1      2944   0.2%
2      2543   0.2%
3      3129   0.2%
4      3591   0.2%
5      3525   0.2%
6      3644   0.2%
7      3207   0.2%

Maximum 10 values

Value  Count  Frequency (%)
665    1      < 0.1%
448    1      < 0.1%
120    1      < 0.1%
110    2      < 0.1%
109    1      < 0.1%
105    3      < 0.1%
104    1      < 0.1%
103    3      < 0.1%
102    8      < 0.1%
101    3      < 0.1%
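
The age summary contains negative values, a maximum of 665, and 70740 zeros, which points to data-entry errors or placeholder values. The sketch below shows one way such values might be flagged before analysis. The cut-off `MAX_PLAUSIBLE_AGE` and the function name `flag_age` are hypothetical choices for illustration, and the sample list only reuses a few values that appear in the tables above.

```python
from collections import Counter

MAX_PLAUSIBLE_AGE = 120  # hypothetical cut-off, not taken from the report


def flag_age(age):
    """Label an age value as 'negative', 'zero', 'too_high', or 'ok'."""
    if age < 0:
        return "negative"
    if age == 0:
        return "zero"
    if age > MAX_PLAUSIBLE_AGE:
        return "too_high"
    return "ok"


# A few values that occur in the age tables above.
sample = [-19, -18, 0, 0, 33, 45, 120, 448, 665]
flags = Counter(flag_age(a) for a in sample)

print(dict(flags))
```

Whether a zero means an infant or an unrecorded age cannot be decided from the profile alone, so the sketch only labels the rows and leaves the handling open.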

Caste
Text

Distinct: 989
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 91.2 MiB

Length

Max length: 52
Median length: 43
Mean length: 8.2739028
Min length: 3

Characters and Unicode

Total characters: 12127746
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: Lingayath
2nd row: VOKKALIGA
3rd row: VOKKALIGA
4th row: VOKKALIGA
5th row: VOKKALIGA

Words

Value                Count   Frequency (%)
vokkaliga            335944  19.8%
lingayath            143937  8.5%
muslim               137765  8.1%
adi                  111940  6.6%
karnataka            93169   5.5%
kuruba               44856   2.6%
achari               41325   2.4%
nayaka               38365   2.3%
lambani              31739   1.9%
brahmin              27141   1.6%
Other values (1069)  687604  40.6%

Most occurring characters

Value              Count    Frequency (%)
A                  2497215  20.6%
K                  1079867  8.9%
I                  1027586  8.5%
L                  855437   7.1%
R                  524018   4.3%
G                  518439   4.3%
O                  495271   4.1%
M                  481889   4.0%
V                  479613   4.0%
U                  365712   3.0%
Other values (45)  3802699  31.4%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  12127746  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

Distinct: 190
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 94.7 MiB

Length

Max length: 41
Median length: 27
Mean length: 10.714786
Min length: 3

Characters and Unicode

Total characters: 15705551
Distinct characters: 56
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 22
Unique (%): < 0.1%

Sample

1st row: Farmer
2nd row: Farmer
3rd row: Farmer
4th row: Farmer
5th row: Farmer

Words

Value               Count   Frequency (%)
farmer              385365  18.3%
labourer            198976  9.5%
housewife           177718  8.5%
others              163045  7.8%
pi                  150246  7.1%
specify             150246  7.1%
student             118573  5.6%
businessman         65816   3.1%
officer             52608   2.5%
police              51682   2.5%
Other values (226)  588355  28.0%

Most occurring characters

Value              Count    Frequency (%)
e                  2023553  12.9%
(space)            1829878  11.7%
r                  1795060  11.4%
a                  878460   5.6%
i                  805060   5.1%
o                  733922   4.7%
s                  650939   4.1%
u                  618190   3.9%
t                  616093   3.9%
m                  570605   3.6%
Other values (46)  5183791  33.0%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  15705551  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

Sex
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 227
Missing (%): < 0.1%
Memory size: 86.2 MiB

Length

Max length: 6
Median length: 4
Mean length: 4.6416657
Min length: 4

Characters and Unicode

Total characters: 6802621
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FEMALE
2nd row: MALE
3rd row: MALE
4th row: MALE
5th row: MALE

Common Values

Value      Count   Frequency (%)
MALE       995123  67.9%
FEMALE     469964  32.1%
Enuch      469     < 0.1%
(Missing)  227     < 0.1%

Length

Figure: Histogram of lengths of the category

Figure: Common Values (Plot)

Words

Value   Count   Frequency (%)
male    995123  67.9%
female  469964  32.1%
enuch   469     < 0.1%

Most occurring characters

Value  Count    Frequency (%)
E      1935520  28.5%
M      1465087  21.5%
A      1465087  21.5%
L      1465087  21.5%
F      469964   6.9%
n      469      < 0.1%
u      469      < 0.1%
c      469      < 0.1%
h      469      < 0.1%
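
For a variable with this few categories, the character statistics can be rebuilt from the Common Values counts. The sketch below weights every character of each value by that value's count and leaves out the 227 missing rows, on the assumption that missing entries contribute no characters; the variable names are illustrative.

```python
from collections import Counter

# Value -> count, copied from the Common Values table (missing rows excluded).
value_counts = {"MALE": 995123, "FEMALE": 469964, "Enuch": 469}

# Each character of a value occurs once per row that holds that value.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
non_missing = sum(value_counts.values())
mean_length = total_chars / non_missing

print(total_chars, len(char_counts), char_counts["E"], round(mean_length, 7))
```

The result matches the reported total characters, distinct characters, the count for E, and the mean length, which suggests the mean is taken over non-missing rows only.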

Most occurring categories, scripts, and blocks

Value      Count    Frequency (%)
(unknown)  6802621  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

Distinct: 1034708
Distinct (%): 70.6%
Missing: 0
Missing (%): 0.0%
Memory size: 134.6 MiB

Length

Max length: 101
Median length: 80
Mean length: 39.296896
Min length: 1

Characters and Unicode

Total characters: 57600722
Distinct characters: 111
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 885717
Unique (%): 60.4%

Sample

1st row: HUVINAHALLI,TQ-HUANGUND
2nd row: BASAVA NAGAR GOKAK CTS 190/5 PLAT NO 2,GOKAK
3rd row: BASAVA NAGAR GOKAK CTS 190/5 PLAT NO 2,GOKAK
4th row: BASAVA NAGAR GOKAK CTS 190/5 PLAT NO 2,TQ-GOKAK
5th row: BASAVA NAGAR GOKAK CTS 190/5 PLAT NO 2,TQ-GOKAK

Words

Value                  Count    Frequency (%)
tq                     259100   3.8%
taluk                  171865   2.5%
village                133494   2.0%
cross                  111413   1.7%
no                     110460   1.6%
r/o                    84642    1.3%
(empty)                83782    1.2%
town                   81032    1.2%
main                   79676    1.2%
road                   75665    1.1%
Other values (735756)  5554425  82.3%

Most occurring characters

Value               Count     Frequency (%)
a                   5467165   9.5%
(space)             5381078   9.3%
A                   3484201   6.0%
,                   2404683   4.2%
l                   2350038   4.1%
i                   1936167   3.4%
r                   1714304   3.0%
N                   1536076   2.7%
T                   1521154   2.6%
e                   1518070   2.6%
Other values (101)  30287786  52.6%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  57600722  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

Distinct: 693
Distinct (%): < 0.1%
Missing: 105
Missing (%): < 0.1%
Memory size: 94.5 MiB

Length

Max length: 29
Median length: 23
Mean length: 10.614335
Min length: 3

Characters and Unicode

Total characters: 15557198
Distinct characters: 58
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 52
Unique (%): < 0.1%

Sample

1st row: Bagalkot
2nd row: Bagalkot
3rd row: Bagalkot
4th row: Belagavi Dist
5th row: Belagavi Dist

Words

Value               Count   Frequency (%)
city                410429  19.5%
bengaluru           360518  17.1%
dist                159706  7.6%
belagavi            78240   3.7%
mysuru              75604   3.6%
hassan              63267   3.0%
tumakuru            62450   3.0%
shivamogga          58104   2.8%
mandya              52865   2.5%
kannada             50500   2.4%
Other values (683)  738144  35.0%

Most occurring characters

Value              Count    Frequency (%)
a                  2618344  16.8%
u                  1430994  9.2%
r                  1136119  7.3%
i                  1124739  7.2%
g                  890011   5.7%
l                  792070   5.1%
n                  747638   4.8%
t                  709241   4.6%
(space)            644195   4.1%
y                  579936   3.7%
Other values (48)  4883911  31.4%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  15557198  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

PresentState
Categorical

IMBALANCE 

Distinct: 38
Distinct (%): < 0.1%
Missing: 948
Missing (%): 0.1%
Memory size: 92.3 MiB

Length

Max length: 20
Median length: 9
Mean length: 9.0309359
Min length: 3

Characters and Unicode

Total characters: 13228831
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Karnataka
2nd row: Karnataka
3rd row: Karnataka
4th row: Karnataka
5th row: Karnataka

Common Values

Value              Count    Frequency (%)
Karnataka          1428929  97.5%
Maharashtra        8441     0.6%
Andhra pradesh     8063     0.6%
Tamilnadu          5099     0.3%
Kerala             3220     0.2%
Telangana          1757     0.1%
Uttar pradesh      1435     0.1%
West bengal        1303     0.1%
Bihar              1263     0.1%
Rajasthan          712      < 0.1%
Other values (28)  4613     0.3%
(Missing)          948      0.1%

Length

Figure: Histogram of lengths of the category

Words

Value              Count    Frequency (%)
karnataka          1428929  96.8%
pradesh            10136    0.7%
maharashtra        8441     0.6%
andhra             8063     0.5%
tamilnadu          5099     0.3%
kerala             3220     0.2%
telangana          1757     0.1%
uttar              1435     0.1%
west               1303     0.1%
bengal             1303     0.1%
Other values (38)  6868     0.5%

Most occurring characters

Value              Count    Frequency (%)
a                  5802330  43.9%
r                  1472350  11.1%
n                  1448958  11.0%
t                  1443416  10.9%
K                  1432149  10.8%
k                  1429464  10.8%
h                  40015    0.3%
d                  24458    0.2%
s                  22951    0.2%
e                  18173    0.1%
Other values (35)  94567    0.7%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  13228831  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

PermanentAddress
Categorical

IMBALANCE 

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 81.1 MiB

Length

Max length: 70
Median length: 1
Mean length: 1.000307
Min length: 1

Characters and Unicode

Total characters: 1466233
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 13
Unique (%): < 0.1%

Sample

1st row: ,
2nd row: ,
3rd row: ,
4th row: ,
5th row: ,

Common Values

Value                                                Count    Frequency (%)
,                                                    1465770  > 99.9%
CHINCHALAKATTI LT,KERUR TQ: BADAMI                   1        < 0.1%
Rakam Karnali Garama Dehakh Dist Behari Zone,NEpal   1        < 0.1%
na,na                                                1        < 0.1%
No,No                                                1        < 0.1%
Kailori village, Ronija post,Nadavai taluk           1        < 0.1%
NAGALYANDA,NAGALYANDA                                1        < 0.1%
Naduvil Villlage, Podomadattil Mandala Post,,Kannur  1        < 0.1%
KITHANUR VILLAGE BIDARAHALLI(H),BANGALORE EAST TQ    1        < 0.1%
HANUR(V) KOLLEGALA ,CH N                             1        < 0.1%
Other values (4)                                     4        < 0.1%
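
Although PermanentAddress reports zero missing values, 1465770 of its 1465783 rows hold only a comma, so the column is effectively empty. A minimal sketch of how such placeholders might be mapped to missing values before analysis follows. The placeholder set and the function name `normalize_address` are hypothetical and would need to be adapted to the actual data.

```python
# Hypothetical set of placeholder strings, compared case-insensitively.
PLACEHOLDERS = {"", ",", "na,na", "no,no"}


def normalize_address(raw):
    """Return None for placeholder-only addresses, otherwise the stripped text."""
    if raw is None:
        return None
    text = raw.strip()
    if text.lower() in PLACEHOLDERS:
        return None
    return text


# A few values that appear in the Common Values table above.
samples = [",", "na,na", "No,No", "CHINCHALAKATTI LT,KERUR TQ: BADAMI"]
cleaned = [normalize_address(s) for s in samples]

# Share of rows that hold only the comma placeholder, from the reported counts.
placeholder_share = 100 * 1465770 / 1465783

print(cleaned)
print(f"{placeholder_share:.4f}%")
```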

Length

Figure: Histogram of lengths of the category

Words

Value              Count    Frequency (%)
(empty)            1465770  > 99.9%
tq                 3        < 0.1%
nagara             2        < 0.1%
village            2        < 0.1%
thota              1        < 0.1%
hanur(v            1        < 0.1%
kollegala          1        < 0.1%
ch                 1        < 0.1%
n                  1        < 0.1%
mudigere,mudigere  1        < 0.1%
Other values (40)  40       < 0.1%

Most occurring characters

Value              Count    Frequency (%)
,                  1465789  > 99.9%
a                  49       < 0.1%
(space)            40       < 0.1%
A                  27       < 0.1%
N                  17       < 0.1%
l                  17       < 0.1%
i                  16       < 0.1%
o                  16       < 0.1%
r                  15       < 0.1%
n                  15       < 0.1%
Other values (50)  232      < 0.1%

Most occurring categories, scripts, and blocks

Value      Count    Frequency (%)
(unknown)  1466233  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

Distinct: 693
Distinct (%): < 0.1%
Missing: 105
Missing (%): < 0.1%
Memory size: 94.5 MiB

Length

Max length: 29
Median length: 23
Mean length: 10.614335
Min length: 3

Characters and Unicode

Total characters: 15557198
Distinct characters: 58
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 52
Unique (%): < 0.1%

Sample

1st row: Bagalkot
2nd row: Bagalkot
3rd row: Bagalkot
4th row: Belagavi Dist
5th row: Belagavi Dist

Words

Value               Count   Frequency (%)
city                410429  19.5%
bengaluru           360518  17.1%
dist                159706  7.6%
belagavi            78240   3.7%
mysuru              75604   3.6%
hassan              63267   3.0%
tumakuru            62450   3.0%
shivamogga          58104   2.8%
mandya              52865   2.5%
kannada             50500   2.4%
Other values (683)  738144  35.0%

Most occurring characters

Value              Count    Frequency (%)
a                  2618344  16.8%
u                  1430994  9.2%
r                  1136119  7.3%
i                  1124739  7.2%
g                  890011   5.7%
l                  792070   5.1%
n                  747638   4.8%
t                  709241   4.6%
(space)            644195   4.1%
y                  579936   3.7%
Other values (48)  4883911  31.4%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  15557198  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

PermanentState
Categorical

IMBALANCE 

Distinct: 38
Distinct (%): < 0.1%
Missing: 948
Missing (%): 0.1%
Memory size: 92.3 MiB

Length

Max length: 20
Median length: 9
Mean length: 9.0309359
Min length: 3

Characters and Unicode

Total characters: 13228831
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Karnataka
2nd row: Karnataka
3rd row: Karnataka
4th row: Karnataka
5th row: Karnataka

Common Values

Value              Count    Frequency (%)
Karnataka          1428929  97.5%
Maharashtra        8441     0.6%
Andhra pradesh     8063     0.6%
Tamilnadu          5099     0.3%
Kerala             3220     0.2%
Telangana          1757     0.1%
Uttar pradesh      1435     0.1%
West bengal        1303     0.1%
Bihar              1263     0.1%
Rajasthan          712      < 0.1%
Other values (28)  4613     0.3%
(Missing)          948      0.1%

Length

Figure: Histogram of lengths of the category

Words

Value              Count    Frequency (%)
karnataka          1428929  96.8%
pradesh            10136    0.7%
maharashtra        8441     0.6%
andhra             8063     0.5%
tamilnadu          5099     0.3%
kerala             3220     0.2%
telangana          1757     0.1%
uttar              1435     0.1%
west               1303     0.1%
bengal             1303     0.1%
Other values (38)  6868     0.5%

Most occurring characters

Value              Count    Frequency (%)
a                  5802330  43.9%
r                  1472350  11.1%
n                  1448958  11.0%
t                  1443416  10.9%
K                  1432149  10.8%
k                  1429464  10.8%
h                  40015    0.3%
d                  24458    0.2%
s                  22951    0.2%
e                  18173    0.1%
Other values (35)  94567    0.7%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  13228831  100.0%

Because every character is classified as (unknown), the per-category, per-script, and per-block character tables repeat the Most occurring characters table.

Distinct: 88
Distinct (%): < 0.1%
Missing: 4
Missing (%): < 0.1%
Memory size: 86.7 MiB

Length

Max length: 30
Median length: 5
Mean length: 5.0020139
Min length: 4

Characters and Unicode

Total characters: 7331847
Distinct characters: 51
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 29
Unique (%): < 0.1%

Sample

1st row: India
2nd row: India
3rd row: India
4th row: India
5th row: India

Words

Value              Count    Frequency (%)
india              1464637  99.9%
indonesia          238      < 0.1%
nepal              192      < 0.1%
iran               84       < 0.1%
haiti              70       < 0.1%
bangladesh         68       < 0.1%
thailand           49       < 0.1%
macedonia          36       < 0.1%
united             33       < 0.1%
uganda             30       < 0.1%
Other values (91)  480      < 0.1%

Most occurring characters

Value                Count      Frequency (%)
a                    1466034    20.0%
n                    1465655    20.0%
i                    1465362    20.0%
d                    1465191    20.0%
I                    1465027    20.0%
e                    818        < 0.1%
s                    420        < 0.1%
l                    406        < 0.1%
o                    374        < 0.1%
r                    290        < 0.1%
Other values (41)    2270       < 0.1%

Most occurring categories

Value        Count      Frequency (%)
(unknown)    7331847    100.0%

Most occurring scripts

Value        Count      Frequency (%)
(unknown)    7331847    100.0%

Most occurring blocks

Value        Count      Frequency (%)
(unknown)    7331847    100.0%

PersonType
Categorical

Distinct        11
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     91.7 MiB

Injured             664146
complainnant        453219
Missing             147206
Deceased            109167
Others              56198
Other values (6)    35847

Length

Max length       22
Median length    7
Mean length      8.6119917
Min length       4

Characters and Unicode

Total characters       12623311
Distinct characters    31
Distinct categories    1
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Deceased
2nd row    Injured
3rd row    Injured
4th row    Injured
5th row    Injured

Common Values

Value                     Count     Frequency (%)
Injured                   664146    45.3%
complainnant              453219    30.9%
Missing                   147206    10.0%
Deceased                  109167    7.4%
Others                    56198     3.8%
Kidnapped                 24187     1.7%
Rape                      9788      0.7%
Unidentified Dead Body    1121      0.1%
Unidentified Person       667       < 0.1%
Arrest                    77        < 0.1%

Length

Histogram of lengths of the category

Value               Count     Frequency (%)
injured             664146    45.2%
complainnant        453219    30.9%
missing             147206    10.0%
deceased            109167    7.4%
others              56198     3.8%
kidnapped           24187     1.6%
rape                9788      0.7%
unidentified        1788      0.1%
dead                1121      0.1%
body                1121      0.1%
Other values (3)    751       0.1%

Most occurring characters

Value                Count      Frequency (%)
n                    2199446    17.4%
e                    1087275    8.6%
a                    1050701    8.3%
d                    827512     6.6%
i                    777182     6.2%
r                    721186     5.7%
u                    664153     5.3%
I                    664146     5.3%
j                    664146     5.3%
c                    562386     4.5%
Other values (21)    3405178    27.0%

Most occurring categories

Value        Count       Frequency (%)
(unknown)    12623311    100.0%

Most occurring scripts

Value        Count       Frequency (%)
(unknown)    12623311    100.0%

Most occurring blocks

Value        Count       Frequency (%)
(unknown)    12623311    100.0%

InjuryType
Categorical

MISSING 

Distinct        5
Distinct (%)    < 0.1%
Missing         456136
Missing (%)     31.1%
Memory size     86.3 MiB

Fatal             349090
Minor             315373
Not Applicable    221248
Grievous          122253
Abused            1683

Length

Max length       14
Median length    5
Mean length      7.3371277
Min length       5

Characters and Unicode

Total characters       7407909
Distinct characters    21
Distinct categories    1
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Fatal
2nd row    Fatal
3rd row    Fatal
4th row    Fatal
5th row    Fatal

Common Values

Value             Count     Frequency (%)
Fatal             349090    23.8%
Minor             315373    21.5%
Not Applicable    221248    15.1%
Grievous          122253    8.3%
Abused            1683      0.1%
(Missing)         456136    31.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of common values (image not preserved)]

Value         Count     Frequency (%)
fatal         349090    28.4%
minor         315373    25.6%
not           221248    18.0%
applicable    221248    18.0%
grievous      122253    9.9%
abused        1683      0.1%

Most occurring characters

Value                Count      Frequency (%)
a                    919428     12.4%
l                    791586     10.7%
i                    658874     8.9%
o                    658874     8.9%
t                    570338     7.7%
p                    442496     6.0%
r                    437626     5.9%
F                    349090     4.7%
e                    345184     4.7%
M                    315373     4.3%
Other values (11)    1919040    25.9%

Most occurring categories

Value        Count      Frequency (%)
(unknown)    7407909    100.0%

Most occurring scripts

Value        Count      Frequency (%)
(unknown)    7407909    100.0%

Most occurring blocks

Value        Count      Frequency (%)
(unknown)    7407909    100.0%

Injury_Nature
Text

MISSING 

Distinct        1291
Distinct (%)    4.7%
Missing         1438422
Missing (%)     98.1%
Memory size     45.6 MiB

Length

Max length       50
Median length    49
Mean length      7.8716056
Min length       1

Characters and Unicode

Total characters       215375
Distinct characters    101
Distinct categories    1
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
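
The report shows a single category, script, and block for this variable, each labelled (unknown). As a rough illustration of the kind of per-character tally such tables rest on, the sketch below counts characters and their Unicode general categories with Python's standard `unicodedata` module. The function name `character_profile` is hypothetical, the input is the five sample rows shown for Injury_Nature, and nothing here is a claim about how ydata-profiling computes its own figures. Note that `unicodedata` exposes general categories but not scripts or blocks.

```python
import unicodedata
from collections import Counter


def character_profile(values):
    """Tally characters and Unicode general categories over a list of strings.

    Hypothetical helper for illustration only; missing entries (None) are skipped.
    """
    chars = Counter()
    categories = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            chars[ch] += 1
            # unicodedata.category returns codes such as 'Lu', 'Ll', 'Zs'
            categories[unicodedata.category(ch)] += 1
    return chars, categories


# The five sample rows listed for Injury_Nature
sample = ["No Injury", "grievous", "grievous", "Miner", "Grievous"]
chars, categories = character_profile(sample)

print(chars.most_common(3))
print(categories.most_common())
```

Upper- and lower-case letters are counted separately here, which matches the character tables in this report (for example, `i` and `I` appear as distinct rows).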

Unique

Unique        785
Unique (%)    2.9%

Sample

1st row    No Injury
2nd row    grievous
3rd row    grievous
4th row    Miner
5th row    Grievous

Value                 Count    Frequency (%)
minor                 7792     22.2%
simple                4278     12.2%
grievous              3724     10.6%
injury                2532     7.2%
grevious              2383     6.8%
fatal                 1464     4.2%
nature                1244     3.6%
in                    1160     3.3%
head                  800      2.3%
not                   511      1.5%
Other values (773)    9145     26.1%

Most occurring characters

Value                Count     Frequency (%)
i                    15628     7.3%
I                    13807     6.4%
r                    12174     5.7%
e                    10552     4.9%
o                    10210     4.7%
R                    9892      4.6%
M                    9393      4.4%
G                    8881      4.1%
S                    8837      4.1%
n                    8416      3.9%
Other values (91)    107585    50.0%

Most occurring categories

Value        Count     Frequency (%)
(unknown)    215375    100.0%

Most occurring scripts

Value        Count     Frequency (%)
(unknown)    215375    100.0%

Most occurring blocks

Value        Count     Frequency (%)
(unknown)    215375    100.0%

Interactions

[Figures: nine interaction plots (images not preserved)]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
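
The per-column nullity figures that these two plots visualise can also be tallied by hand. The following is a minimal sketch, assuming the rows are available as plain dictionaries and that a missing value is either `None` or a float NaN. The helpers `is_missing` and `missing_by_column` are hypothetical, and the four rows carry only three of the eighteen columns, with values modelled on the last rows of the sample.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_by_column(rows, columns):
    """Return {column: (missing_count, missing_percent)} for the given rows."""
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            if is_missing(row.get(col)):
                counts[col] += 1
    total = len(rows)
    return {
        col: (n, 100.0 * n / total if total else 0.0)
        for col, n in counts.items()
    }


# Illustrative rows: three columns only, missing entries written as None
rows = [
    {"PersonType": "Kidnapped", "InjuryType": None, "Injury_Nature": None},
    {"PersonType": "complainnant", "InjuryType": "Abused", "Injury_Nature": None},
    {"PersonType": "Rape", "InjuryType": "Not Applicable", "Injury_Nature": None},
    {"PersonType": "Missing", "InjuryType": None, "Injury_Nature": None},
]

summary = missing_by_column(rows, ["PersonType", "InjuryType", "Injury_Nature"])
for column, (count, percent) in summary.items():
    print(f"{column}: {count} missing ({percent:.1f}%)")
```

Note that the PersonType value `Missing` is an ordinary category label, not a null, so it does not count towards the missing total.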

Sample

District_Name  UnitName  Year  Month  age  Caste  Profession  Sex  PresentAddress  PresentCity  PresentState  PermanentAddress  PermanentCity  PermanentState  Nationality_Name  PersonType  InjuryType  Injury_Nature
0BagalkotAmengad PS2016114LingayathFarmerFEMALEHUVINAHALLI,TQ-HUANGUNDBagalkotKarnataka,BagalkotKarnatakaIndiaDeceasedFatalNaN
1BagalkotAmengad PS2016149VOKKALIGAFarmerMALEBASAVA NAGAR GOKAK CTS 190/5 PLAT NO 2,GOKAKBagalkotKarnataka,BagalkotKarnatakaIndiaInjuredFatalNaN
2BagalkotAmengad PS201610VOKKALIGAFarmerMALEBASAVA NAGAR GOKAK CTS 190/5 PLAT NO 2,GOKAKBagalkotKarnataka,BagalkotKarnatakaIndiaInjuredFatalNaN
3BagalkotAmengad PS2016134VOKKALIGAFarmerMALEBASAVA NAGAR GOKAK CTS 190/5 PLAT NO 2,TQ-GOKAKBelagavi DistKarnataka,Belagavi DistKarnatakaIndiaInjuredFatalNaN
4BagalkotAmengad PS2016136VOKKALIGAFarmerMALEBASAVA NAGAR GOKAK CTS 190/5 PLAT NO 2,TQ-GOKAKBelagavi DistKarnataka,Belagavi DistKarnatakaIndiaInjuredFatalNaN
5BagalkotAmengad PS2016160GANIGAHousewifeFEMALEAMBLIKOPPA,TQ-HUNGUNDBagalkotKarnataka,BagalkotKarnatakaIndiaDeceasedFatalNaN
6BagalkotAmengad PS2016140MUSLIMDriverMALEBELAGAVI,TQ-BELAGAVIBagalkotKarnataka,BagalkotKarnatakaIndiaInjuredFatalNaN
7BagalkotAmengad PS2016120LingayathLabourerMALEHIREBADAWADAGI,TQ-HUNAGUNDBagalkotKarnataka,BagalkotKarnatakaIndiaInjuredFatalNaN
8BagalkotAmengad PS2016118LingayathFarmerMALEHIREBADAWADAGI,TQ-HUNAGUNDBagalkotKarnataka,BagalkotKarnatakaIndiaInjuredFatalNaN
9BagalkotAmengad PS2016155VOKKALIGAFarmerMALEGUDUR SC,TQ-HUNAGUNDBagalkotKarnataka,BagalkotKarnatakaIndiaDeceasedFatalNaN

District_Name  UnitName  Year  Month  age  Caste  Profession  Sex  PresentAddress  PresentCity  PresentState  PermanentAddress  PermanentCity  PermanentState  Nationality_Name  PersonType  InjuryType  Injury_Nature
1465773YadgirYadgiri Women PS20231115BEDARUStudentFEMALER/o Thanagundi,tq dist yadgiriYadgirKarnataka,YadgirKarnatakaIndiaKidnappedNaNNaN
1465774YadgirYadgiri Women PS20231131MADIGANurseFEMALER/o Naykal,Now At Mata Manikeshwari Nagara YadgiriYadgirKarnataka,YadgirKarnatakaIndiacomplainnantAbusedNaN
1465775YadgirYadgiri Women PS20231116BEGADIStudentFEMALER/o M Hosalli,tq dist yadgiriYadgirKarnataka,YadgirKarnatakaIndiaKidnappedNaNNaN
1465776YadgirYadgiri Women PS20231219MUSLIMStudentFEMALER/o Sagar B,Tq Shahapur dist YadgiriYadgirKarnataka,YadgirKarnatakaIndiaMissingNaNNaN
1465777YadgirYadgiri Women PS2024122LAMBANIFarmerFEMALESAMANAPURA SANNA THANDA,YADAGIRYadgirKarnataka,YadgirKarnatakaIndiaRapeNot ApplicableNaN
1465778YadgirYadgiri Women PS2024119CHRISTIANStudentFEMALER/o Hosalli Cross Near Ratnama School,yadgiriYadgirKarnataka,YadgirKarnatakaIndiaMissingNaNNaN
1465779YadgirYadgiri Women PS2024116HOLAYA, HOLER, HOLEYAHouse help - hiredFEMALER/o Talak Village,tq dist yadgirYadgirKarnataka,YadgirKarnatakaIndiaKidnappedNaNNaN
1465780YadgirYadgiri Women PS2024229LingayathTeacherFEMALER/o Bilahar Village,Tq wadagera dist YadgiriYadgirKarnataka,YadgirKarnatakaIndiacomplainnantAbusedNaN
1465781YadgirYadgiri Women PS2024229REDDYHouse help - hiredFEMALER/o Thanagundi Village,Now at Mini Vidanasouda yadgiriYadgirKarnataka,YadgirKarnatakaIndiacomplainnantAbusedNaN
1465782YadgirYadgiri Women PS2024217KABBALIGAStudentFEMALER/o Bandalli,TQ DIST YADGIRYadgirKarnataka,YadgirKarnatakaIndiaKidnappedNaNNaN

Duplicate rows

Most frequently occurring

District_Name  UnitName  Year  Month  age  Caste  Profession  Sex  PresentAddress  PresentCity  PresentState  PermanentAddress  PermanentCity  PermanentState  Nationality_Name  PersonType  InjuryType  Injury_Nature  # duplicates
1731Bengaluru CityCyber Crime Police Station201970VOKKALIGAFarmerMALE,Bengaluru CityKarnataka,Bengaluru CityKarnatakaIndiacomplainnantNaNNaN39
11289Mangaluru CityMoodabidre PS2021636KURUBPolice officerMALEPOLICE STATION,MOODABIDREMangaluru CityKarnataka,Mangaluru CityKarnatakaIndiacomplainnantNot ApplicableNaN26
15664YadgirKembhavi PS202420VOKKALIGALabourerFEMALER/o:Kalladevanahalli,Tq:HunasagiYadgirKarnataka,YadgirKarnatakaIndiaInjuredMinorNaN24
8252HassanPension Mohalla PS2021559NAYAKPolice officerMALEPSI PMPS,HASSANHassanKarnataka,HassanKarnatakaIndiacomplainnantNot ApplicableNaN20
9131KalaburagiAfzalpur PS201830VOKKALIGAPolice officerMALEAfzalpur Police Station,TQ: AfzalpurKalaburagiKarnataka,KalaburagiKarnatakaIndiacomplainnantFatalNaN20
9134KalaburagiAfzalpur PS201840VOKKALIGAPolice officerMALEAfzalpur Police Station,TQ: AfzalpurKalaburagiKarnataka,KalaburagiKarnatakaIndiacomplainnantFatalNaN20
4033Bengaluru DistNelamangala Traffic PS201780VOKKALIGAFarmerMALE,Bengaluru DistKarnataka,Bengaluru DistKarnatakaIndiaInjuredMinorNaN19
1724Bengaluru CityCyber Crime Police Station2018120VOKKALIGAFarmerMALE,Bengaluru CityKarnataka,Bengaluru CityKarnatakaIndiacomplainnantNaNNaN18
9103KalaburagiAfzalpur PS201760VOKKALIGAPolice officerMALEAfzalpur Police Station,TQ: AfzalpurKalaburagiKarnataka,KalaburagiKarnatakaIndiacomplainnantFatalNaN18
3061Bengaluru CitySampangiramanagar PS2021330LADSelf Employed OthersFEMALENOTKNOW,NOTKNOWBengaluru CityKarnataka,Bengaluru CityKarnatakaIndiaOthersNot ApplicableNaN17
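
The `# duplicates` column gives the number of times each repeated row occurs. A minimal sketch of that kind of tally follows, assuming each record is a hashable tuple of column values. The helper `most_frequent_duplicates` is hypothetical and the records are toy data, not rows from this dataset.

```python
from collections import Counter


def most_frequent_duplicates(records, top=10):
    """Return (record, count) pairs for records seen more than once,
    ordered from most to least frequent."""
    counts = Counter(records)
    repeated = [(rec, n) for rec, n in counts.most_common() if n > 1]
    return repeated[:top]


# Toy records: (district, profession, age)
records = [
    ("District A", "Farmer", 0),
    ("District B", "Driver", 40),
    ("District A", "Farmer", 0),
    ("District C", "Student", 19),
    ("District B", "Driver", 40),
    ("District A", "Farmer", 0),
]

top = most_frequent_duplicates(records)
for rec, n in top:
    print(n, rec)
```

Records that occur only once are dropped, so the result lists only the rows that would count as duplicates.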